💾 Archived View for bootes.me › gemlogs › first-post.gmi captured on 2021-12-06 at 14:29:53. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Before I got this rudimentary bihost setup worked out, I had put a few posts up on this capsule, which are now deleted. Mainly a few journal entries about my interests in programming, thoughts on anime I had watched recently, stuff like that. But last night I sat down and thought about what would be the easiest way to serve content over both gemini and the web. I thought about using Adele's PHP server[1], but I don't know any PHP, so I thought that might be a little difficult to configure. Plus I already had this server running (the gemini side is being served by Agate) and hooked up to my domain name. I had thought before about using some kind of python-based backend web framework like Flask or Bottle, but this also felt like too much for a static site that was only going to serve text content in the form of little markdown blog posts.
The way I ended up doing it was quite simple: I modified a python script I had previously written[2] to convert gemtext to HTML, and wrote another script that recursively goes through all the files and directories in my root gemini directory and writes the converted HTML files to /var/www/html, which is being served by Apache.
```python
#!/usr/bin/env python
# importing required libraries
import sys
import re


class Converter:
    def __init__(self, template=None):
        # A dictionary that maps regex to match at the beginning of gmi lines
        # to their corresponding HTML tag names. Used by convert_single_line().
        self.tags_dict = {
            r"^# (.*)": "h1",
            r"^## (.*)": "h2",
            r"^### (.*)": "h3",
            r"^\* (.*)": "li",
            r"^> (.*)": "blockquote",
            r"^=>\s*(\S+)(\s+.*)?": "a"
        }
        # File path for a template file containing the head of the HTML document
        self.template = template

    # This function takes a string of gemtext as input and returns a string of HTML
    def convert_single_line(self, gmi_line):
        for pattern in self.tags_dict.keys():
            if match := re.match(pattern, gmi_line):
                tag = self.tags_dict[pattern]
                groups = match.groups()
                if tag == "a":
                    href = groups[0]
                    # groups[1] is None when the link has no label, so fall back to the URL
                    inner_text = groups[1].strip() if groups[1] else href
                    return f"<{tag} href='{href}'>{inner_text}</{tag}>"
                else:
                    inner_text = groups[0].strip()
                    return f"<{tag}>{inner_text}</{tag}>"
        return f"<p>{gmi_line}</p>"

    # Reads the contents of the input file line by line and outputs HTML.
    # Renders text in preformat blocks (toggled by ```) as multiline <pre> tags.
    def main(self, gmi_path, html_path):
        with open(gmi_path) as gmi, open(html_path, "w") as html:
            html.write("<html>\n")
            if self.template:
                with open(self.template) as tmpl:
                    for line in tmpl:
                        html.write(line)
                html.write("\n")
            preformat = False
            in_list = False
            html.write("<body>\n")
            for line in gmi:
                line = line.strip()
                if len(line):
                    if line.startswith("```") or line.endswith("```"):
                        preformat = not preformat
                        repl = "<pre>" if preformat else "</pre>"
                        html.write(re.sub(r"```", repl, line))
                    elif preformat:
                        html.write(line)
                    else:
                        html_line = self.convert_single_line(line)
                        if html_line.startswith("<li>"):
                            if not in_list:
                                in_list = True
                                html.write("<ul>\n")
                            html.write(html_line)
                        elif in_list:
                            in_list = False
                            html.write("</ul>\n")
                            html.write(html_line)
                        else:
                            html.write(html_line)
                    html.write("\n")
            # Close a list that runs all the way to the end of the file
            if in_list:
                html.write("</ul>\n")
            html.write("</body>\n")
            html.write("</html>")


# Main guard
if __name__ == "__main__":
    # main() is an instance method, so it needs a Converter object to run on
    Converter().main(sys.argv[1], sys.argv[2])
```
I changed the converter into a class that could be imported into the script that would convert the whole file tree to HTML. I kept pellertson's contribution of surrounding <ul> tags in the output of the converter, but I decided to leave out the smart open to simplify things, since I figure I will probably not be running Converter.py directly from the command line very much. Another addition was the self.template attribute. Like the comments say, this represents a file path to a template for the <head> of the HTML doc. Here's what the head of all the html files on this site looks like as of now for those of you reading this on gemini who can't inspect element:
```html
<head>
  <!-- This would be a good place to add some JS to execute after the page loads for a more rich experience when viewing on the web -->
  <meta charset="utf-8"/>
  <meta name="author" content="Hunter"/>
  <meta name="description" content="A blog about technology, served over gemini and the web."/>
  <meta name="language" content="en-US"/>
  <meta name="viewport" content="width=device-width, initial-scale=1"/>
  <link rel="stylesheet" href="/styles/style.css" type="text/css"/>
</head>
```
So, pretty simple so far. This lets me add some basic metadata and link the universal stylesheet for the site. Right now, the stylesheet only has a couple of rules I added for test purposes and it's pretty ugly. Better styles coming soon. I might also want to cook up some JS to execute after the page loads to make the web browsing experience a little more interactive. I like how Adele's homepage has a navbar and sidenav surrounding the gemini content, and has a little slider where you can adjust the size of the text. Something like that would be cool. I might also want to eventually have multiple templates, say if I want gemlog posts specifically to look a certain way to differentiate them from other pages. I might also want to have a section of the site that would be only viewable on the web, dedicated to little JS projects and toys. In any case, I want to make sure that the main content of the site - the mirrored gemini content - is always totally viewable and navigable without people having to load any JS. I also want to keep the JS simple and vanilla. I don't really want to mess around with using other people's npm packages or fetching resources from external APIs if I can avoid it. It would be better to keep everything very simple, minimal, and basic.
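The multiple-templates idea could be sketched roughly like this. It is only an illustration under assumptions: `template_for`, `SECTION_TEMPLATES`, and the `gemlog.html` file are hypothetical names, and the template paths are guesses at a layout rather than anything that exists on the server yet.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from a top-level capsule directory to a template file.
# Neither the mapping nor gemlog.html exists yet; both are assumptions.
SECTION_TEMPLATES = {
    "gemlogs": "/var/www/html/templates/gemlog.html",
}
# Assumed location of the general-purpose <head> template.
DEFAULT_TEMPLATE = "/var/www/html/templates/template.html"


def template_for(gmi_file, gmi_root):
    """Pick a template based on the first directory below gmi_root."""
    parts = PurePosixPath(gmi_file).relative_to(gmi_root).parts
    # parts[0] names a section only when the file sits inside a subdirectory
    section = parts[0] if len(parts) > 1 else None
    return SECTION_TEMPLATES.get(section, DEFAULT_TEMPLATE)
```

The result could then be passed as the `template` argument when constructing a `Converter` for a given file, so posts under gemlogs/ get their own head while everything else keeps the default.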
Here is the script that goes through the gemini directory and converts all of it to HTML. I used the pathlib module because that's what I am the most familiar with for doing these kinds of operations, but I'm sure there is probably a niftier way to do this in python that I'm not aware of. It works for now:
```python
from pathlib import Path

from Converter import Converter

root_html_dir = "/var/www/html"
root_gmi_dir = "/home/hunter/gemini"
template = "/var/www/html/templates/template.html"
converter = Converter(template)


def sync_tree(gmi_dir, html_dir):
    gmi_dir_path = Path(gmi_dir)
    html_dir_path = Path(html_dir)
    # Create the mirrored directory on the HTML side if it is missing
    if not html_dir_path.exists():
        html_dir_path.mkdir()
    for child in gmi_dir_path.iterdir():
        gmi_child = child.as_posix()
        # Swap the gemini root for the web root and the extension for .html
        html_child = gmi_child.replace(root_gmi_dir, root_html_dir)
        html_child = html_child.replace(".gmi", ".html")
        if child.is_dir():
            sync_tree(gmi_child, html_child)
        elif child.is_file():
            converter.main(gmi_child, html_child)


sync_tree(root_gmi_dir, root_html_dir)
```
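For what it's worth, pathlib can handle the recursion and the path mapping on its own. Here is a minimal sketch of that variant, assuming the same directory layout as the script above; `html_path_for` and `iter_conversions` are hypothetical helper names and not part of that script.

```python
from pathlib import Path


def html_path_for(gmi_file, gmi_root, html_root):
    """Map a .gmi file under gmi_root to the matching .html path under html_root."""
    relative = Path(gmi_file).relative_to(gmi_root)
    # with_suffix only changes the final extension, unlike str.replace,
    # which would also rewrite ".gmi" in the middle of a path
    return Path(html_root) / relative.with_suffix(".html")


def iter_conversions(gmi_root, html_root):
    """Yield (source, target) pairs for every .gmi file below gmi_root."""
    # rglob walks all subdirectories, so no explicit recursion is needed
    for gmi_file in sorted(Path(gmi_root).rglob("*.gmi")):
        yield gmi_file, html_path_for(gmi_file, gmi_root, html_root)
```

Each pair could be handed to `converter.main(str(source), str(target))` after making sure the target's parent exists, for example with `target.parent.mkdir(parents=True, exist_ok=True)`. Note that this version only picks up files ending in .gmi, whereas the script above runs every file through the converter.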
I realize that a lot of this could probably be done in a much more linux-y way using shell utilities like grep, sed, and awk, but to be honest, I do not know very much about shell scripting. Again, I used python because that's the language I know the best, but I'm sure there are simpler ways to do what I'm doing here. I also don't know if converting the whole file tree like this is the best way to update my website. I might only edit a few files but have to rewrite the entire site's HTML, and as it gets bigger this might become a pain. It would be better if I could just write a gemlog post, and then when I save the file have it automatically convert only that file. I could do a workaround in python, but there's probably a good shell utility to solve this problem, and I figure I should learn what it is and how to use it instead.
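The python workaround could be as small as comparing modification times before converting, which is the same kind of timestamp check that build tools like make rely on. A minimal sketch, where `needs_rebuild` is a hypothetical helper name and not something in the scripts above:

```python
from pathlib import Path


def needs_rebuild(gmi_file, html_file):
    """Return True if html_file is missing or older than gmi_file."""
    gmi_file, html_file = Path(gmi_file), Path(html_file)
    if not html_file.exists():
        return True
    # A source file modified after its HTML output means the output is stale
    return gmi_file.stat().st_mtime > html_file.stat().st_mtime
```

In `sync_tree`, the file branch could then become `elif child.is_file() and needs_rebuild(gmi_child, html_child):`, so unchanged pages are skipped. One caveat of this approach: editing the template would not trigger a rebuild, since only the .gmi timestamps are checked.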
I look forward to using this setup as a notepad to express my thoughts and document my projects.