{"id":19782441,"url":"https://github.com/steadylearner/sitemap","last_synced_at":"2025-04-30T22:30:56.786Z","repository":{"id":39496566,"uuid":"185997361","full_name":"steadylearner/Sitemap","owner":"steadylearner","description":"You can visit https://www.steadylearner.com/sitemap.xml or sitemap.txt","archived":false,"fork":false,"pushed_at":"2023-01-04T01:17:21.000Z","size":1754,"stargazers_count":7,"open_issues_count":20,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-04-14T13:09:42.097Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://www.steadylearner.com","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/steadylearner.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-05-10T13:56:27.000Z","updated_at":"2020-07-03T16:28:00.000Z","dependencies_parsed_at":"2023-02-01T17:16:44.978Z","dependency_job_id":null,"html_url":"https://github.com/steadylearner/Sitemap","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/steadylearner%2FSitemap","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/steadylearner%2FSitemap/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/steadylearner%2FSitemap/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/steadylearner%2FSitemap/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/steadylearner","download_url":"https://codeload.github.com/steadylearner/Sitemap/tar.gz/refs/heads/master","host":{"n
ame":"GitHub","url":"https://github.com","kind":"github","repositories_count":224225286,"owners_count":17276435,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T06:05:09.577Z","updated_at":"2024-11-12T06:05:10.185Z","avatar_url":"https://github.com/steadylearner.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003c!--\n    Post{\n        subtitle: \"Build sitemap automatically with Rust\",\n        image: \"post/sitemap/automate-sitemap-rust.png\",\n        image_decription: \"Image by Steadylearner\",\n        tags: \"Rust How sitemap automate\",\n    }\n--\u003e\n\n\u003c!-- Link --\u003e\n\n[Rust Sitemap Crate]: https://github.com/svmk/rust-sitemap\n[Steadylearner]: https://www.steadylearner.com\n[Steadylearner Github Repository]: https://github.com/steadylearner/Steadylearner\n[How to deploy Rust Web App]: https://medium.com/@steadylearner/how-to-deploy-rust-web-application-8c0e81394bd5?source=---------9------------------\n[Rust Diesel]: http://diesel.rs/\n[Sitemap in React]: https://medium.com/@steadylearner/how-to-build-a-sitemap-for-react-app-7bbc3040dc1f\n[Steadylearner Sitemap]: https://github.com/steadylearner/Sitemap\n[What is image sitemap]: https://support.google.com/webmasters/answer/178636\n\n[futures-timer]: https://github.com/rustasync/futures-timer\n[timers-tokio]: https://tokio.rs/docs/going-deeper/timers/\n\n\u003c!-- / --\u003e\n\n\u003c!-- Steadylearner Post --\u003e\n\n[sitemap]: https://www.steadylearnerc.com/blog/search/sitemap\n[sitemap.xml]: 
https://www.steadylearner.com/sitemap.xml\n[txt sitemap]: https://www.steadylearner.com/sitemap.txt\n[image sitemap]: https://www.steadylearner.com/image_sitemap.xml\n\n\u003c!-- / --\u003e\n\nIn this post, we will write reusable functions that build **sitemap.txt**, **sitemap.xml** and **image_sitemap.xml** (a sitemap for images).\n\nYou can later run them in a thread, then wrap them in an interval to automate the process without blocking the main function. You may also use them from a CLI.\n\n\u003cbr /\u003e\n\n\u003ch2 class=\"red-white\"\u003e[Prerequisite]\u003c/h2\u003e\n\n1. [Rust Sitemap Crate]\n2. [What is image sitemap]\n3. [What is sitemap](https://support.google.com/webmasters/answer/156184?hl=en)\n4. [How to build a sitemap](https://www.google.com/search?client=firefox-b-d\u0026q=how+to+build+sitemap)\n5. [Futures in Rust](https://docs.rs/futures/0.2.3-docs-yank.4/futures/)\n6. [Thread in Rust](https://doc.rust-lang.org/std/thread/)\n7. [futures-timer]\n8. [timers-tokio]\n\n---\n\nI recommend that you visit them and read the posts for [sitemap] at [Steadylearner].\n\nI will assume that you already have some experience with Rust and programming in general.\n\n\u003cbr /\u003e\n\n\u003ch2 class=\"blue\"\u003eTable of Contents\u003c/h2\u003e\n\n1. Separate functions to build sitemaps\n2. Use them inside fn main() with thread\n3. Interval and automation\n4. **Conclusion**\n\n---\n\n\u003cbr /\u003e\n\n## 1. Separate functions to build sitemaps\n\nWe will write **Cargo.toml** first. 
Because we will use a thread inside **main.rs**, we don't need separate Rust bin files as we did in the other posts for [sitemap].\n\n```toml\n# Cargo.toml, write code similar to this\n\n# $cargo run --bin \u003cname\u003e will point to the path we define here\n# (\u003cname\u003e is main for this post)\n[[bin]]\nname = \"main\"\npath = \"src/bin/main.rs\"\n\n# lib is used like a crate and can import and export other files at the same directory level.\n\n[lib]\nname = \"your_lib\"\npath = \"src/lib.rs\"\n```\n\nThen, we will define the **image_sitemap_renewal** function in **sitemap_renew.rs** first.\n\nIt will be similar to\n\n```rust\nextern crate diesel;\nextern crate sl_lib;\n\nuse std::fs;\nuse std::fs::{write, File};\nuse std::fs::OpenOptions;\nuse std::io::prelude::*;\n\nuse chrono::{DateTime, FixedOffset, NaiveDate};\nuse sitemap::reader::{SiteMapEntity, SiteMapReader};\nuse sitemap::structs::{ChangeFreq, SiteMapEntry, UrlEntry};\nuse sitemap::writer::SiteMapWriter;\n\nuse sl_lib::models::*;\nuse sl_lib::*;\n\nuse diesel::prelude::*;\n\npub fn image_sitemap_renewal() -\u003e std::io::Result\u003c()\u003e {\n    use crate::schema::images::dsl::*;\n    let connection = init_pool().get().unwrap();\n\n    let image_results = images\n        .load::\u003cImage\u003e(\u0026*connection)\n        .expect(\"Error loading images\");\n\n    println!(\n        \"\\nIt starts to write image_sitemap.xml for {} images\",\n        image_results.len()\n    );\n\n    // https://support.google.com/webmasters/answer/178636?hl=en\n\n    let start_xml = r#\"\u003c?xml version=\"1.0\" encoding=\"UTF-8\"?\u003e\n\u003curlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\" xmlns:image=\"http://www.google.com/schemas/sitemap-image/1.1\"\u003e\n  \u003curl\u003e\n    \u003cloc\u003ehttps://www.steadylearner.com\u003c/loc\u003e\n\"#;\n\n    fs::write(\"image_sitemap.xml\", \u0026start_xml)?;\n    let mut result = 
OpenOptions::new().append(true).open(\"image_sitemap.xml\").unwrap();\n\n    let mut image_xml = String::new();\n    for image in image_results {\n        let image_url = format!(\n            \"    \u003cimage:image\u003e\n      \u003cimage:title\u003e{}\u003c/image:title\u003e\n      \u003cimage:caption\u003e{}\u003c/image:caption\u003e\n      \u003cimage:loc\u003ehttps://www.steadylearner.com{}\u003c/image:loc\u003e\n      \u003cimage:license\u003ehttps://www.steadylearner.com\u003c/image:license\u003e\n    \u003c/image:image\u003e\n\",\n            image.title,\n            image.content,\n            image.media_url,\n        );\n        image_xml.push_str(\u0026image_url);\n    }\n\n    if let Err(e) = writeln!(result, \"{}{}\", image_xml , r#\"  \u003c/url\u003e\n\u003c/urlset\u003e \"#) {\n        eprintln!(\"Couldn't write to file: {}\", e);\n    }\n\n    println!(\"image_sitemap.xml was built. Include it to main sitemap.xml.\");\n\n    Ok(())\n}\n```\n\nOnly location and name of the function are different from [the previous posts][sitemap].\n\nThen, we will build **sitemap_txt_renewal** function that will use **sitemap.xml** made from **sitemap_renewal** function later.\n\n```rust\nfn sitemap_txt_renewal() -\u003e std::io::Result\u003c()\u003e {\n    let mut urls = Vec::new();\n    let mut sitemaps = Vec::new();\n    let mut errors = Vec::new();\n    let file = File::open(\"sitemap.xml\").expect(\"Unable to open file.\");\n    let parser = SiteMapReader::new(file);\n    for entity in parser {\n        match entity {\n            SiteMapEntity::Url(url_entry) =\u003e {\n                urls.push(url_entry);\n            },\n            SiteMapEntity::SiteMap(sitemap_entry) =\u003e {\n                sitemaps.push(sitemap_entry);\n            },\n            SiteMapEntity::Err(error) =\u003e {\n                errors.push(error);\n            },\n        }\n    }\n\n    // println!(\"payload = {:?}\", urls[0].loc.get_url().unwrap());\n\n    let mut output = 
String::new();\n    output.push_str(\"http://www.steadylearner.com/video/search/*\nhttp://www.steadylearner.com/video/watch/*\nhttp://www.steadylearner.com/video/write/*\nhttp://www.steadylearner.com/image/search/*\nhttp://www.steadylearner.com/blog/search/*\nhttp://www.steadylearner.com/blog/read/*\nhttp://www.steadylearner.com/code/search/*\nhttp://www.steadylearner.com/static/images/*\n\");\n\n    for url in urls {\n        let payload = url.loc.get_url().unwrap();\n        println!(\"{}\", \u0026payload);\n        let payload_with_new_line = format!(\"{}\\n\", \u0026payload);\n        output.push_str(\u0026payload_with_new_line);\n    }\n\n    println!(\"{:#?}\", \u0026output);\n    write(\"sitemap.txt\", \u0026output)?;\n\n    println!(\"errors = {:?}\", errors);\n\n    Ok(())\n}\n```\n\nYou may use all selector * and others before you make **sitemap.txt** from **sitemap.xml**.\n\nLastly, our **sitemap_renewal** function will be similar to\n\n```rust\npub fn sitemap_renewal(static_routes: Vec\u003c\u0026str\u003e, paths_for_other_sitemaps: Vec\u003c\u0026str\u003e) -\u003e std::io::Result\u003c()\u003e {\n    // Use database with Rust diesel to write sitemap.xml first\n    use crate::schema::posts::dsl::*;\n    let connection = init_pool().get().unwrap();\n\n    let post_results = posts\n        .filter(published.eq(true))\n        .order(created_at.desc())\n        .load::\u003cPost\u003e(\u0026*connection)\n        .expect(\"Error loading posts\");\n\n    println!(\n        \"\\nIt starts to write sitemap.xml for {} posts\",\n        post_results.len()\n    );\n\n    let mut output = Vec::\u003cu8\u003e::new();\n    {\n        let sitemap_writer = SiteMapWriter::new(\u0026mut output);\n\n        let mut urlwriter = sitemap_writer\n            .start_urlset()\n            .expect(\"Unable to write urlset\");\n\n        let today = what_is_date_today();\n\n        let date = DateTime::from_utc(\n            NaiveDate::from_ymd(today.year, today.month, 
today.day).and_hms(0, 0, 0),\n            FixedOffset::east(0),\n        );\n\n        let home_entry = UrlEntry::builder()\n            .loc(\"http://www.steadylearner.com\")\n            .changefreq(ChangeFreq::Monthly)\n            .lastmod(date)\n            .build()\n            .expect(\"valid\");\n        urlwriter.url(home_entry).expect(\"Unable to write url\");\n\n        for route in static_routes.iter() {\n            let static_url = format!(\"http://www.steadylearner.com/{}\", route);\n            let url_entry = UrlEntry::builder()\n                .loc(static_url)\n                .changefreq(ChangeFreq::Monthly)\n                .lastmod(date)\n                .build()\n                .expect(\"valid\");\n\n            urlwriter.url(url_entry).expect(\"Unable to write url\");\n        }\n\n        for post in post_results {\n            let post_url = format!(\n                \"http://www.steadylearner.com/blog/read/{}\",\n                post.title.replace(\" \", \"-\")\n            );\n            let url_entry = UrlEntry::builder()\n                .loc(post_url)\n                .changefreq(ChangeFreq::Yearly)\n                .lastmod(date)\n                .build()\n                .expect(\"valid\");\n\n            urlwriter.url(url_entry).expect(\"Unable to write url\");\n        }\n\n        // assigining value to sitemap_writer is important to make the next process work\n        let sitemap_writer = urlwriter.end().expect(\"close the urlset block\");\n\n        // You may use this if you have many sitemaps.\n\n        // let mut sitemap_index_writer = sitemap_writer\n        //     .start_sitemapindex()\n        //     .expect(\"start sitemap index tag\");\n\n        // for path_for_other_sitemap in paths_for_other_sitemaps {\n        //     let entire_path_for_other_sitemap =\n        //         format!(\"https://www.steadylearner.com/{}\", path_for_other_sitemap);\n\n        //     let sitemap_entry = SiteMapEntry::builder()\n        
//         .loc(entire_path_for_other_sitemap)\n        //         .lastmod(date)\n        //         .build()\n        //         .expect(\"valid\");\n\n        //     sitemap_index_writer\n        //         .sitemap(sitemap_entry)\n        //         .expect(\"Can't write the file\");\n        // }\n\n        // sitemap_index_writer.end().expect(\"close sitemap block\");\n    }\n\n    write(\"sitemap.xml\", \u0026output)?;\n    // Propagate any I/O error instead of silently dropping the Result\n    sitemap_txt_renewal()?;\n\n    Ok(())\n}\n```\n\nIts purposes are\n\n1. build **sitemap.xml** from static routes and the data from the database\n2. link another sitemap such as **image_sitemap.xml** we built before\n3. and make **sitemap.txt** from **sitemap.xml**\n\nCompared with the code snippets from the previous post for [sitemap], we removed almost all **println!** calls that showed results in the console.\n\nYou may not need them when what you want is automation and you already know what the code snippets here do.\n\nWe organized all our code into separate functions. They are now reusable and can be called from anywhere.\n\n\u003cbr /\u003e\n\n## 2. Use them inside fn main() with thread\n\nWe made functions to build sitemaps. 
We will call them inside **main.rs** so that the sitemaps are rebuilt whenever we start our website.\n\nYou won't want this to block the main process, so you may run it in a separate **thread**.\n\nFor that, **main.rs** should be similar to this.\n\n```rust\nuse std::thread;\n\nmod sitemap_renew;\nuse crate::sitemap_renew::{sitemap_renewal, image_sitemap_renewal};\n\nfn main() {\n    thread::Builder::new()\n        .name(\"Build sitemap automatically with Rust\".into())\n        .spawn(|| {\n            image_sitemap_renewal()?;\n            let static_paths = vec![\n                \"about\",\n                \"video\",\n                \"blog\",\n                \"code\",\n                \"image\",\n                \"slideshow\",\n            ];\n            let other_sitemaps = vec![\"image_sitemap.xml\"];\n            sitemap_renewal(static_paths, other_sitemaps)\n        })\n        .unwrap();\n\n    // your_website();\n}\n```\n\nWe invested time in modularizing the sitemap code, and **main.rs** became simpler with a thread.\n\nYou can test it with **$cargo run --bin main** and you will see files similar to [sitemap.xml], [txt sitemap] and [image sitemap] for [Steadylearner].\n\n\u003cbr /\u003e\n\n## 3. 
Interval and automation\n\nIf you want to automate the process, you may use the **Interval** API from [futures-timer], [timers-tokio], etc.\n\nI had working code with [futures-timer] similar to\n\n```rust\nextern crate futures;\nextern crate futures_timer;\n\nuse std::thread;\nuse std::time::Duration;\n\nuse futures::prelude::*;\nuse futures_timer::Interval;\n\nmod sitemap_renew;\nuse crate::sitemap_renew::{sitemap_renewal, image_sitemap_renewal};\n\nfn main() {\n    thread::Builder::new()\n        .name(\"Build sitemap automatically with Rust\".into())\n        .spawn(|| {\n            // { day: 86400, week: 604800, month: 2592000, }\n            Interval::new(Duration::from_secs(1))\n                .for_each(|()| {\n                    image_sitemap_renewal()?;\n                    let static_paths = vec![\n                        \"about\",\n                        \"video\",\n                        \"blog\",\n                        \"code\",\n                        \"image\",\n                        \"slideshow\",\n                    ];\n                    let other_sitemaps = vec![\"image_sitemap.xml\"];\n                    sitemap_renewal(static_paths, other_sitemaps)\n                })\n                .wait()\n                .unwrap();\n        })\n        .unwrap();\n\n    // your_website(); // It is an arbitrary name, don't give it importance.\n}\n```\n\nIts API was rewritten with the introduction of **async** in **Rust**, and this code seems to no longer work.\n\nHowever, you may refer to the code snippet above and the crates' documentation for your project.\n\nWith tokio, your main code would be similar to\n\n```rust\nextern crate tokio;\nuse tokio::prelude::*;\nuse tokio::timer::Interval;\n\nuse std::time::{Duration, Instant};\n\nuse std::thread;\n\nmod sitemap_renew;\nuse crate::sitemap_renew::{sitemap_renewal, image_sitemap_renewal};\n\nfn main() {\n    thread::Builder::new()\n        .name(\"Build sitemap automatically with Rust\".into())\n        .spawn(|| {\n            // { day: 86400, week: 604800, month: 2592000, }\n            let task = 
Interval::new(Instant::now(), Duration::from_secs(604800))\n            .for_each(|instant| {\n                println!(\"fire; instant={:?}\", instant);\n                image_sitemap_renewal().expect(\"Error while making sitemap.xml for images\");\n                let static_paths = vec![\"about\", \"video\", \"blog\", \"code\", \"image\", \"slideshow\"];\n                let other_sitemaps = vec![\"image_sitemap.xml\"];\n                sitemap_renewal(static_paths, other_sitemaps);\n                Ok(())\n            })\n            .map_err(|e| panic!(\"interval errored; err={:?}\", e));\n\n            tokio::run(task);\n        })\n        .unwrap();\n\n    // your_website();\n}\n```\n\nWhen you run it with **$cargo run --bin main**, it will show log similar to this\n\n```console\nfire; instant=Instant\n\nhttp://www.steadylearner.com/\nhttp://www.steadylearner.com/about\nhttp://www.steadylearner.com/video\nhttp://www.steadylearner.com/blog\nhttp://www.steadylearner.com/code\nhttp://www.steadylearner.com/image\nhttp://www.steadylearner.com/slideshow\nhttp://www.steadylearner.com/blog/read/How-to-start-Rust-Chat-App\nhttp://www.steadylearner.com/blog/read/How-to-install-Rust\n\nerrors = []\n```\n\nor more with interval seconds you define.\n\nThe **thread** Rust API here is optional. It will work without it.\n\n\u003cbr /\u003e\n\n## 4. Conclusion\n\nThe latest code for [sitemap] can be found at [Steadylearner Sitemap] repository. It is the end of the posts for [sitemap] with Rust.\n\n**Thanks and please share this post with others.**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsteadylearner%2Fsitemap","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsteadylearner%2Fsitemap","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsteadylearner%2Fsitemap/lists"}